Abstract: We introduce a new perspective into the field of 
quantitative information flow (QIF) analysis that invites the 
community to bound the leakage, reported by QIF quantifiers, 
by a range consistent with the size of a program's secret input 
instead of by a mathematically sound (but counter-intuitive) upper 
bound of that leakage. To substantiate our position, we present a 
refinement of a recent QIF metric that appears in the literature. 
Our refinement is based on slight changes we bring into the design 
of that metric. These changes do not affect the theoretical premises 
onto which the original metric is laid. However, they enable the 
natural association between flow results and the exhaustive search 
effort needed to uncover a program's secret information (or the 
residual secret part of that information) to be clearly established. 
The refinement we discuss in this paper validates our perspective 
and demonstrates its importance in the future design of QIF 
quantifiers. 

Index Terms — computer security, quantitative information flow, 
information theory, uncertainty, inference, program analysis 

I. Introduction 

The goal of information flow analysis is to enforce limits 
on the use of information that apply to all computations that 
involve that information. For instance, a confidentiality property 
requires that a program with secret inputs should not leak 
those inputs into its public outputs. Qualitative information flow 
properties, such as non-interference, are expensive, impossible, 
or rarely satisfied by real programs: generally some flow exists, 
and many systems remain secure provided that the amount 
of flow is sufficiently small. Moreover, designers wish to 
distinguish acceptable from unacceptable flows. 

Systems often reveal a summary of secret information they 
store. The summary contains fewer bits and provides a limit 
on the attacker's inference. For instance, a patient's report is 
released with the disease name covered by a black rectangle. 
However, it is not easy to precisely determine how much 
information exists in the summary. For instance, if the font 
size is uniform on the patient's report, the width of the black 
rectangle might determine the length of the disease name. 
Quantitative information flow (QIF) analysis is an approach 
that establishes bounds on information that is leaked by a 
program. In QIF, confidentiality properties are also expressed, 
but as limits on the number of bits that might be revealed from 
a program's execution. A violation is declared if the number 
of leaked bits exceeds the policy. Because information theory 
forms the foundation of QIF analysis, it should be possible 
to associate the quantities reported by QIF quantifiers with 
the effort needed to uncover secret information via exhaustive 
search. However, establishing this association is infeasible with 
QIF quantifiers that do not report a flow consistent with the 
size of a program's secret input, but instead a mathematically 
sound upper bound of that flow [1]. For instance, consider 
the QIF metric and the password checker in Section 1 of [1], 
and assume that the password space has a cardinality of 3. 
This means that the size of the password is log 3 = 1.5849 
bits. (Here and hereafter, all logarithms are to the base 2.) 
Nonetheless, the metric in [1] might report a flow that exceeds 
1.5849 bits, which makes it impossible to determine the space 
of the exhaustive search that should be carried out in order to 
reveal the residual secret part of the password. However, if the 
flow reported is always less than 1.5849 bits, the exhaustive 
search space becomes evident. 

We believe that the counter-intuitive flow quantities reported 
by some QIF quantifiers, that appear in the literature, are due 
to a flaw in the design of those quantifiers, and that simple 
tweaks can bound those quantities by a range consistent with 
the size of a program's secret input. This paper takes the first 
step in this direction and refines the QIF metric suggested in 
[1]. The metric in [1] is based on a new perspective for QIF 
analysis. The fundamental idea is to model an attacker's belief 
about a program's secret input as a probability distribution over 
high states. This belief is then revised, using Bayesian updating 
techniques, as the attacker interacts with a program's execution. 
It is believed that the work reported in [1] is the first to address 
an attacker's belief in quantifying information flow. This work 
was later expanded and appeared in [2]. A number of relevant 
results [3], [4] were reported in the sequel; however, the work 
in [1], [2] is sufficient as a foundation of our work. 

A. Plan of the Paper 

The remainder of this paper is organized as follows. Section II 
elaborates on accuracy-based information flow analysis, 
which is the major contribution in [1]. In this section, we 
give a concise elucidation of the elements of this analysis and 
how it differs from the classical uncertainty-based information 
flow analysis. In addition, we uncover some inexplicable results 
reported by the QIF metric in [1], and argue that the reasoning 
of this metric's designers is incomplete. We further state the 
general range of flow reported by the metric in [1] that applies 
to both deterministic and probabilistic programs as well as 
to all types of attacker's beliefs. This range is given neither 
in [1] nor in [2]. Over the course of acquiring the range, 
we reveal the ineffectiveness of the admissibility restriction 
suggested in [1]. At the end of Section II we conjecture a 
simple fix that can bound the results reported by the metric 
in [1]. Underpinning our arguments in Section II is a formal 
definition of a size-consistent QIF quantifier. Our definition 
is based on uncertainty-based information flow analysis, and 
it inaugurates the new perspective we are introducing into 
the field of QIF. To the best of our knowledge, this is the 
first definition to capture the correlation between the size 
of a program's secret input and the quantification of flow 
from that input in the general case. Section III concentrates 
on Kullback-Leibler divergence, which is a centerpiece of the 
metric in [1]. We give some mathematical interpretations of 
this divergence, and then focus on its discrimination construct, 
suggesting the replacement of this construct with a better 
one, and subsequently the replacement of the divergence itself 
with another, bounded, divergence. This paves the way for 
the refinement of the metric in [1], which is what we fulfill 
in two stages in Section IV. We also give the range and the 
interpretation of the refined metric, and prove its properties 
and their meaningfulness compared to the original one, while 
minding the consistency of the probability distributions dealt 
with. Having justified the conjecture we made in Section II 
and shown that a large number of possible refinements of the 
metric in [1] exist, we discuss the association of the original and 
the refined metric with the exhaustive search effort in Section V, 
give some remarks in Section VI, and conclude the paper 
in Section VII. The proofs are given in Appendix I. 

II. Uncertainty- vs. Accuracy-based Information 
Flow Analysis 

The problem with uncertainty-based information flow analy- 
sis is that it ignores reality. As an example, consider a simple 
password checker VWC [1] that sets an authentication flag a 
after checking a stored password p against a guessed password 
g supplied by the user. 

VWC : if p = g then a := 1 else a := 0    (1) 

For simplicity, suppose that the password space is $W_p = \{A, B, C\}$, 
which gives a size of $\log|W_p| = \log 3 = 1.5849$ 
bits for the password p. Suppose further that the user is 
actually an attacker attempting to discover the password. Before 
interacting with a VWC execution, this attacker believes that 
the password is overwhelmingly likely to be A but has a very 
small and equally likely chance to be either B or C. More 
concretely and adopting the convention in [1], the attacker's 
prebelief about p is captured using a probability distribution 
$b_H : W_p \to [0, 1]$ as shown in Table Ia. 



  p      |  A      B      C              p      |  A      B      C 
  b_H    |  0.98   0.01   0.01           b'_H   |  0      0.5    0.5 

  (a) Attacker's prebelief               (b) Attacker's postbelief 

TABLE I: Attacker's beliefs in the password p 

The attacker's uncertainty about p (not necessarily about 
the correct p) is obtained via a simple application of Shannon 
uncertainty functional [5]: 

$$U = S(b_H) = -0.98 \log 0.98 - 2 \cdot 0.01 \log 0.01 = 0.1614 \text{ bits}$$ 

Assuming that the correct password (the reality) is C, if the 
attacker complies with her prebelief and feeds a VWC execution 
with g = A, she will observe a equal to 0. The attacker then 
infers that A is not the real password, and that there is an equal 
chance of 50% that the password is either B or C. As a result, 
the attacker's postbelief distributes as shown in Table Ib and 
the attacker's uncertainty about p becomes: 

$$U' = S(b'_H) = -0.5 \log 0.5 - 0.5 \log 0.5 = 1 \text{ bit}$$ 

To complete an uncertainty-based information flow analysis, 
we have to compute the reduction in uncertainty by subtracting 
the post- from the pre-uncertainty using the formula: 

$$\mathcal{R} = U - U'$$ 

This gives us $\mathcal{R} = 0.1614 - 1 = -0.8386$ bits. In the 
sense of uncertainty-based analysis, the negative $\mathcal{R}$ means 
absence of information flow. There is nothing wrong with this 
interpretation provided that we do not connect information flow 
with how far an attacker's belief is from reality. However, if 
we connect the flow with the distance between an attacker's 
belief and reality, then the interpretation that $\mathcal{R}$ supports does 
not make sense. The measure $\mathcal{R}$ ignores reality by measuring 
$b_H$ and $b'_H$ against each other only, instead of against the high 
state (which is C as the correct password in our example). It 
is worth noticing, however, that the range of flow reported by 
$\mathcal{R}$ is as given by the formula: 

$$\varrho_{\mathcal{R}} = [-\log|W_p|, \log|W_p|] = [-1.5849, 1.5849]$$ 

This is a direct consequence of Shannon uncertainty functional 
falling in the range $[0, \log|W_p|]$ [5]. The range $\varrho_{\mathcal{R}}$ 
reported by $\mathcal{R}$ is plausible if we remember that the size of the 
password p is 1.5849 bits. We now take a moment to define 
the size-consistent QIF quantifier. 
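
The uncertainty-based computation above can be reproduced with a short sketch. This is only an illustration under stated assumptions: the helper name `shannon_uncertainty` and the dictionary encoding of beliefs are ours, not part of the original analysis, and the postbelief values are those described in the text (0 for A, 0.5 each for B and C).

```python
import math

def shannon_uncertainty(belief):
    """Shannon uncertainty (base-2 entropy) of a belief {state: probability}."""
    # Zero-probability states contribute nothing (0 * log 0 is taken as 0).
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

# Beliefs from the password checker example (hypothetical encoding).
prebelief = {"A": 0.98, "B": 0.01, "C": 0.01}
postbelief = {"A": 0.0, "B": 0.5, "C": 0.5}

u_pre = shannon_uncertainty(prebelief)    # U
u_post = shannon_uncertainty(postbelief)  # U'
reduction = u_pre - u_post                # R = U - U'
size_bits = math.log2(len(prebelief))     # size of the secret, log|W_p|

print(f"U = {u_pre:.4f}, U' = {u_post:.4f}, R = {reduction:.4f}")
print(f"|R| <= log|W_p|: {abs(reduction) <= size_bits}")
```

The reduction comes out negative here, yet it stays inside the interval bounded by the size of the password, which is the property the following definition formalizes.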

Definition 1 (Size-consistent QIF Quantifier): We say that a 
QIF quantifier is size-consistent if its reported results are 
bounded (from above and from below) by the size of a program's 
secret input. Formally, let QUAN be a QIF quantifier, 
and assume that the size of a program's secret input is $\eta$ bits. 
We say that QUAN is size-consistent if: 

$$QUAN_{max} \le \eta \quad \text{and} \quad QUAN_{min} \ge -\eta$$ 

However, if we merely look at the attacker's prebelief and 
postbelief in C, as the correct password, we realize that the 
attacker's belief has approached reality from interacting with 
VWC. Approaching reality cannot happen unless the attacker 
learns something from an amount of information VWC has 
conveyed. This conveyance corresponds to positive informa- 
tion flow that informs the attacker, and flatly contradicts the 
uncertainty-based interpretation. 

The earliest investigation of this specific inadequacy of 
uncertainty-based information flow analysis appeared in [1] and 
was later expanded in [2]. The authors of [2] propose to respect 
reality through what they call "accuracy-based information flow 
analysis". This sort of analysis has two elements: 
E1. Quantifying information flow from a program's execution 
to an attacker. 
E2. Respecting the distance between an attacker's belief and 
reality. 

The uncertainty-based analysis does not have the second 
element, as the example above demonstrated. The accuracy-based 
analysis quantifies flow as the improvement in the accuracy of 
an attacker's belief. This is equivalent to the reduction 
in the distance between an attacker's belief and reality. The 
metric advanced in [2] is based on this notion of improvement, 
and is given by the formula: 



$$Q(\mathcal{E}, b'_H) = D(b_H \to \dot{\sigma}_H) - D(b'_H \to \dot{\sigma}_H) \qquad (2)$$ 

where $\mathcal{E} = \langle S, b_H, \sigma_H, \sigma_L \rangle$ is an experiment tuple as defined 
in [2], $(\mathcal{E}, b'_H)$ is the outcome of that experiment, $b_H$ is the 
attacker's prebelief, $b'_H$ is the attacker's postbelief, $\dot{\sigma}_H$ is a 
probability distribution that maps the high state $\sigma_H$ to 1 (this 
is the certainty about the high state; about reality), and D is 
Kullback-Leibler divergence (also known as relative entropy or 
information gain [6]) given by the formula: 

$$D(b \to b') = \sum_{\sigma \in W_p} b'(\sigma) \cdot \log \frac{b'(\sigma)}{b(\sigma)} \qquad (3)$$ 



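Notice in formula (2) how Q respects reality by measuring 
$b_H$ and $b'_H$ against the correct high state $\dot{\sigma}_H$, instead of against 
each other only. Formula (2) is simplified in [2] to (this 
simplification is reality-aware): 

$$Q(\mathcal{E}, b'_H) = D(b_H \to \dot{\sigma}_H) - D(b'_H \to \dot{\sigma}_H)$$ 
$$= \sum_{\sigma \in W_p} \dot{\sigma}_H(\sigma) \cdot \log \frac{\dot{\sigma}_H(\sigma)}{b_H(\sigma)} - \sum_{\sigma \in W_p} \dot{\sigma}_H(\sigma) \cdot \log \frac{\dot{\sigma}_H(\sigma)}{b'_H(\sigma)}$$ 
$$= -\log b_H(\sigma_H) + \log b'_H(\sigma_H) \qquad (4)$$ 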



To complete an accuracy-based information flow analysis 
parallel to the uncertainty-based analysis we have completed 
earlier in this section, we apply formula (4) to the same example 
given above to obtain: 

$$Q(\mathcal{E}, b'_H) = -\log 0.01 + \log 0.5 = 5.6438 \text{ bits} \qquad (5)$$ 
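
As a quick numerical check of the accuracy-based computation, the following sketch evaluates the simplified form of the metric (the negated log of the prebelief in the correct state plus the log of the postbelief in it). The function name `accuracy_flow` and the dictionary encoding are assumptions made for illustration only.

```python
import math

def accuracy_flow(prebelief, postbelief, true_state):
    """Simplified accuracy-based flow: -log b_H(s) + log b'_H(s), in bits."""
    return -math.log2(prebelief[true_state]) + math.log2(postbelief[true_state])

# Beliefs from the password checker example; the correct password is C.
prebelief = {"A": 0.98, "B": 0.01, "C": 0.01}
postbelief = {"A": 0.0, "B": 0.5, "C": 0.5}

flow = accuracy_flow(prebelief, postbelief, "C")
secret_size = math.log2(len(prebelief))

print(f"Q = {flow:.4f} bits, secret size = {secret_size:.4f} bits")
print(f"flow exceeds secret size: {flow > secret_size}")
```

The reported flow is larger than the number of bits needed to store the password, which is the inconsistency examined next.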



The flow value of 5.6438 bits reported by Q violates the 
plausible range $\varrho_{\mathcal{R}} = [-1.5849, 1.5849]$ and equally exceeds 
the size needed to store the password p. How can a flow from 
p exceed the size needed to store p? This is a sound but puzzling 
result in the field of QIF analysis, which the authors of [2] 
attribute to the attacker's prebelief not being uniform; it is 
more erroneous than a uniform belief ascribing 1/3 probability 
to each password A, B, and C, and therefore a larger amount 
of information is required to correct it! But what can the source 
of this larger amount of information be? Is it a covert agent 
external to the system and the attacker when all the agents are 
assumed condensed to just the attacker and the system [2]? 
Besides, is it always true that a uniform attacker's prebelief 
would, in a series of experiments, cause her to learn a total of 
log 3 bits [2]? This claim is valid for a deterministic password 
checker, but incomplete for a probabilistic one. Let us verify 
this fact. 



It is proved in [2] that for deterministic programs (including 
the deterministic VWC given in formula (1)), we have: 

$$b_H(\sigma_H) \le b'_H(\sigma_H) \qquad (6)$$ 

Since $b'_H$ is a probability distribution, we can write: 

$$b_H(\sigma_H) \le b'_H(\sigma_H) \le 1$$ 

which means: 

$$\log b_H(\sigma_H) \le \log b'_H(\sigma_H) \le 0$$ 
$$0 \le Q \le -\log b_H(\sigma_H)$$ 

The attacker's prebelief is assumed uniform on $W_p$, therefore: 

$$0 \le Q \le \log 3$$ 

Thus, it is beyond a shadow of a doubt that a uniform 
attacker's prebelief would cause her to learn a total of log 3 
bits from interacting with a deterministic VWC. But does 
the attacker's learning outcome differ when interacting with 
a probabilistic VWC? An illustrative probabilistic VWC is: 



VVWC : if p = g then a := 1 $_{0.99}[]$ a := 0 
       else a := 0 $_{0.99}[]$ a := 1 



The inequality in formula (6) no longer holds, and we are 
free to write: 

$$0 \le b'_H(\sigma_H) \le 1$$ 
$$-\infty \le -\log b_H(\sigma_H) + \log b'_H(\sigma_H) \le -\log b_H(\sigma_H)$$ 
$$-\infty \le Q < 0 \quad \text{or} \quad 0 \le Q \le \log 3$$ 

The sub-range $-\infty \le Q < 0$ shows that a uniform attacker's 
prebelief might cause her to learn an infinite number of 
misinforming bits from interacting with VVWC. This demonstrates 
the incompleteness of the claim "a uniform attacker's prebelief 
would, in a series of experiments, cause her to learn a total of 
log 3 bits" made in [2]. 
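
A misinforming (negative) flow from a probabilistic checker can be illustrated with a small Bayesian-update sketch. The modelling choices here are assumptions for illustration: we take the checker to set the flag correctly with probability 0.99 and to flip it otherwise, and the function names are hypothetical rather than taken from [2].

```python
import math

def likelihood(observed_a, password, guess, p_correct=0.99):
    """Probability of observing flag `observed_a` under the assumed noisy checker."""
    intended = 1 if password == guess else 0
    return p_correct if observed_a == intended else 1.0 - p_correct

def bayes_update(prebelief, guess, observed_a):
    """Revise a belief over passwords after one run of the noisy checker."""
    weights = {pw: prob * likelihood(observed_a, pw, guess)
               for pw, prob in prebelief.items()}
    total = sum(weights.values())
    return {pw: w / total for pw, w in weights.items()}

def accuracy_flow(prebelief, postbelief, true_state):
    """Simplified accuracy-based flow: -log b_H(s) + log b'_H(s), in bits."""
    return -math.log2(prebelief[true_state]) + math.log2(postbelief[true_state])

# Uniform prebelief; the correct password is C and the attacker guesses C,
# but the checker (with probability 0.01) reports failure.
uniform = {pw: 1.0 / 3.0 for pw in "ABC"}
post = bayes_update(uniform, guess="C", observed_a=0)
flow = accuracy_flow(uniform, post, "C")

print(f"postbelief in C = {post['C']:.6f}, Q = {flow:.4f} bits")
```

Even though the prebelief is uniform, the unlucky observation pushes the attacker away from reality, so the flow is negative rather than a contribution toward a total of log 3 bits.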

The previous discussion motivates the investigation of the 
general range of the Q metric that holds with both deterministic 
and probabilistic programs as well as with all types of attacker's 
beliefs. This range is attained in Lemma 1. 

Lemma 1: Considering both deterministic and probabilistic 
programs, and all types of an attacker's beliefs, the general 
range of flow reported by Q is: 

$$\varrho_Q = (-\infty, -\log b_H(\sigma_H)]$$ 

Clearly Q is not size-consistent. Let us now muse on the 
computation in formula (5) and try to find a means to 
proceed with this correspondence. The flow of 5.6438 bits has 
brought the attacker from $-\log 0.01 = 6.6438$ bits away from 
reality to $-\log 0.5 = 1$ bit away from it. In addition, and as 
proved in Theorem 3 of [2], each bit of flow has made the 
attacker twice as likely to guess correctly, or equivalently 
twice as certain about the correct high state (in total, we have 
a $2^{5.6438}$ times increase in the likelihood of a correct guess). 
In the uncertainty-based definition, the attacker's certainty is 
ascribed to a high state that might be incorrect. Conjecture 1 
brings this correspondence to a close. 



Conjecture 1: Considering Theorem 3 in [2], if a bit of flow 
makes the attacker more than twice as likely to guess correctly, 
then Q should become size-consistent. 

Seeking a justification for this conjecture will be the purpose 
of the later sections. Although the authors of [1], [2] are 
acclaimed for their contribution to the field of QIF through 
their accuracy-based analysis, their metric allows the respect 
for reality (element E2) to attenuate the quality of flow 
quantification (element E1). This attenuation is the result of severe 
discrimination in Kullback-Leibler divergence as we shall see 
in the next section. 

III. Concentrating on Kullback-Leibler 
Divergence 

A. Possible Interpretations of the Divergence 

The divergence D between b and b', given in formula (3), 
can be interpreted in terms of code inefficiency as follows: D is 
the average number of bits that are wasted by encoding events 
from a distribution b' with a code based on a not-quite-right 
distribution b 0. Another way of writing D in terms of the 
expected value function (9) is as follows: 

$$D(b \to b') = E_{b'}\left(\log \frac{b'(\sigma)}{b(\sigma)}\right), \qquad E_{b'}(f) = \sum_{\sigma} b'(\sigma) \cdot f(\sigma) \qquad (7)$$ 

The function $E_{b'}$ takes the weighted average of the values 
$f(\sigma)$ in which the weights are probabilities b'. In the original 
paper by Kullback and Leibler [10], the values: 

$$\mathcal{I}_{Dis}(\sigma) = \log \frac{b'(\sigma)}{b(\sigma)}$$ 

are seen as the information in $\sigma$ for the discrimination between 
b and b'. This is plausible if we rewrite the previous values as: 

$$-\log b(\sigma) + \log b'(\sigma)$$ 

and recall that the information contained in an observation of 
an event E with probability p(E) is $-\log p(E)$ [15]. 

This notion of discrimination leads to another interpretation 
of D; it is the weighted average of the information in $\sigma$ for 
the discrimination between b and b' where the weights are 
probabilities b'. We write: 

$$D(b \to b') = E_{b'}(\mathcal{I}_{Dis}(\sigma)) \qquad (8)$$ 

B. A Better Discrimination Construct 

We propose to replace the discrimination construct in formula 
(8) with the following: 

$$\mathcal{I}'_{Dis}(\sigma) = \log \frac{b'(\sigma)}{\frac{b'(\sigma) + b(\sigma)}{2}} \qquad (9)$$ 

for $\mathcal{I}'_{Dis}(\sigma)$ to be the information in $\sigma$ for the discrimination 
between the mean (b' + b)/2 and b'. But what is the effect of 
this replacement? The following lemma shows that we have 
actually cut down the discrimination at least by half. 

Lemma 2: The proposed discrimination construct cuts down 
the discrimination in Kullback-Leibler divergence at least by 
half, that is: $\mathcal{I}'_{Dis}(\sigma) \le \frac{1}{2} \mathcal{I}_{Dis}(\sigma)$. 



A graphical comparison between $\mathcal{I}_{Dis}(\sigma)$ and $\mathcal{I}'_{Dis}(\sigma)$ is 
shown in Figure 1a. It is important to notice at this stage that 
halving the infinite value of $\mathcal{I}_{Dis}(\sigma)$ does not make it finite. 
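
The halving property can be spot-checked numerically. The sketch below is only an illustration: the two function names are ours, the first computes the Kullback-Leibler discrimination log(b'/b), and the second computes the discrimination against the mean (b' + b)/2 as described in the text.

```python
import math
import random

def kl_discrimination(b, b_prime):
    """Kullback-Leibler discrimination: log2(b' / b)."""
    return math.log2(b_prime / b)

def mean_discrimination(b, b_prime):
    """Discrimination between the mean (b' + b) / 2 and b'."""
    return math.log2(b_prime / ((b_prime + b) / 2))

# Randomised check that the mean-based construct never exceeds half
# of the Kullback-Leibler construct (strictly positive probabilities only).
rng = random.Random(0)
for _ in range(10_000):
    b = rng.uniform(1e-6, 1.0)
    b_prime = rng.uniform(1e-6, 1.0)
    assert mean_discrimination(b, b_prime) <= 0.5 * kl_discrimination(b, b_prime) + 1e-12

# The values from the password example: b = 0.01, b' = 0.5.
print(mean_discrimination(0.01, 0.5), 0.5 * kl_discrimination(0.01, 0.5))
```

The check follows directly from the inequality of the arithmetic and geometric means, so it holds for every strictly positive pair and not only for the sampled ones.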

C. A Better Divergence 

Substituting (9) for $\mathcal{I}_{Dis}$ in (8), we get the divergence: 

$$D'(b \to b') = \sum_{\sigma} b'(\sigma) \cdot \log \frac{b'(\sigma)}{\frac{b'(\sigma) + b(\sigma)}{2}} \qquad (10)$$ 

The resulting divergence coincides with the asymmetric form 
K of Jensen-Shannon divergence proposed in [11]. In fact, 
formula (9) and Lemma 2 both appear in [11] wrapped in the 
expected value function. D' is nonnegative and equals zero if 
and only if b = b' [11]. This is essential for any measure of 
difference and justifies using D' instead of D to measure the 
distance between two beliefs. A possible interpretation of D' 
is as follows: how much information is lost if we describe the 
two random variables that correspond to b and b' with their 
average distribution (b' + b)/2? This interpretation gives D' 
the nickname "information radius" [8]. 

A graphical comparison between D and D' is shown in 
Figure 1b. Notice that D approaches infinity when t approaches 
0 or 1. In contrast, D' is always well defined in the entire range 
$t \in [0, 1]$. This is because the mean (b' + b)/2 remains nonzero 
when only one of b' and b is zero. But what is the effect of using D' instead of D in Q? 
This will be our focus in the next section. 
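
The contrast between the two divergences can be seen on an extreme pair of beliefs. The following is a minimal sketch under our own conventions (beliefs as dictionaries, terms with zero weight skipped, and an infinite value returned when Kullback-Leibler divergence is unbounded); the function names are hypothetical.

```python
import math

def kl_divergence(b, b_prime):
    """D(b -> b') = sum of b'(s) * log2(b'(s) / b(s)); infinite if b(s) = 0 < b'(s)."""
    total = 0.0
    for state, q in b_prime.items():
        if q == 0:
            continue  # zero-weight terms contribute nothing
        p = b.get(state, 0.0)
        if p == 0:
            return math.inf
        total += q * math.log2(q / p)
    return total

def k_divergence(b, b_prime):
    """D'(b -> b') = sum of b'(s) * log2(b'(s) / mean(s)), mean = (b' + b) / 2."""
    total = 0.0
    for state, q in b_prime.items():
        if q == 0:
            continue
        mean = (q + b.get(state, 0.0)) / 2  # nonzero whenever q > 0
        total += q * math.log2(q / mean)
    return total

# A belief that is certain about the wrong state, measured against reality.
wrong = {"A": 1.0, "B": 0.0}
reality = {"A": 0.0, "B": 1.0}

print(kl_divergence(wrong, reality))  # unbounded
print(k_divergence(wrong, reality))   # bounded
```

Because the mean in the denominator is nonzero whenever the weight b' is nonzero, the second function never divides by zero and never needs the infinite branch.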

IV. Refining the Metric 
A. Refining to Normalization 

If we substitute (10) for (3) in (2), we get the metric: 

$$Q'(\mathcal{E}, b'_H) = D'(b_H \to \dot{\sigma}_H) - D'(b'_H \to \dot{\sigma}_H)$$ 
$$= -\log(1 + b_H(\sigma_H)) + \log(1 + b'_H(\sigma_H))$$ 

Notice that the above substitution does not destroy the 
bedrock of accuracy-based analysis which, as mentioned in 
Section II, quantifies flow as the improvement in the accuracy of 
an attacker's belief. This guarantees that Q' is a real metric 
of information flow. Before proceeding any further, we need to 
investigate the general range of Q', which is what we do in 
Lemma 3. 

Lemma 3: Considering both deterministic and probabilistic 
programs, and all types of an attacker's beliefs, and avoiding 
the imposition of any admissibility restriction on those beliefs, 
the general range of flow reported by Q' is: 

$$\varrho_{Q'} = [-1, 1]$$ 

Fortunately, the sub-range [-1, 0] corresponds to the 
attacker's misinformation while the sub-range [0, 1] corresponds 
to the attacker's information about the correct high state. 

The new range $\varrho_{Q'} = [-1, 1]$ that we have reached does 
not make Q' size-consistent. Nonetheless, $\varrho_{Q'}$ is a plausible 
normalization (flow percentage) that is invariant with respect 
to the choice of the measurement unit. 



Fig. 1: Graphical comparisons made in the paper: (a) between $\mathcal{I}_{Dis}(\sigma)$ and $\mathcal{I}'_{Dis}(\sigma)$; (b) between D and D'; (c) between Q and Q". 



B. Refining to Actuality 

To ensure bits as the measurement unit, and avoid the need 
to transform the flow results back and forth between the ranges 
$\varrho_{Q'} = [-1, 1]$ and $\varrho_{\mathcal{R}} = [-1.5849, 1.5849]$, we let $\eta$ be the 
size of a program's secret input in bits, and define the refined 
metric as: 

$$Q''(\mathcal{E}, b'_H) = \eta \cdot Q'(\mathcal{E}, b'_H)$$ 
$$= \eta \cdot [-\log(1 + b_H(\sigma_H)) + \log(1 + b'_H(\sigma_H))] \qquad (11)$$ 

A graphical comparison between Q and Q" in the case of 
VWC, along with the size-consistent uncertainty-based upper 
and lower bounds of flow, is shown in Figure 1c. It is important 
to notice in this figure that the parts of the Q and Q" graphs 
that fall above the zero mark on the Y axis represent the 
attacker's information about the correct high state. In contrast, 
the attacker's misinformation is represented by the parts that 
fall below the zero mark on the Y axis. Another important 
observation to make in this figure is that, akin to Q, Q" is 
sensitive to changes in the attacker's belief. It is thus noted 
that Q" is a good quantifier of flow (element El) that adheres 
well to reality (element E2). 

C. Range of the Refined Metric 

The most celebrated property of the refined metric is probably 
its range, which is sought in Theorem 1. 

Theorem 1: Considering both deterministic and probabilistic 
programs, and all types of an attacker's beliefs, and avoiding 
the imposition of any admissibility restriction on those beliefs, 
the general range of flow reported by Q" is: 

$$\varrho_{Q''} = [-\eta \cdot \log(1 + b_H(\sigma_H)), \ \eta \cdot [1 - \log(1 + b_H(\sigma_H))]]$$ 

where $\eta$ is the size of a program's secret input in bits. 

Corollary 1: Notice that $\log(1 + b_H(\sigma_H)) \le 1$. This means 
that $Q''_{max} \le \eta$ and $Q''_{min} \ge -\eta$, and makes Q" size-consistent. 

D. Interpreting the Refined Metric 

If we apply formula (11) to the same example given in 
Section II, we get: 

$$Q''(\mathcal{E}, b'_H) = 0.9044 \text{ bits}$$ 

This time, the flow of 0.9044 bits has brought the attacker 
from 

$$1.5849 \cdot [1 - \log(1 + 0.01)] = 1.5621 \text{ bits}$$ 

away from reality to 

$$1.5849 \cdot [1 - \log(1 + 0.5)] = 0.6577 \text{ bits}$$ 

away from it. But how much more likely did this flow make the 
attacker to guess correctly? Theorem 2 answers this question, 
substantiating the validity of Conjecture 1 we made in Section II 
and showing that a bit of flow reported by Q" makes the 
attacker more than twice as likely to guess correctly. 

Theorem 2: A flow of k bits reported by Q" makes the 
attacker more than $2^k$ times as likely to guess correctly. Strictly 
speaking: 

$$Q''(\mathcal{E}, b'_H) = k \iff b'_H(\sigma_H) = 2^{k/\eta} \cdot b_H(\sigma_H) + 2^{k/\eta} - 1 \qquad (12)$$ 

where $\eta$ is the size of a program's secret input in bits. 
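
The refined metric and the relation stated in Theorem 2 can be checked on the running example with a short sketch. The function names and the scalar interface (passing only the prebelief and postbelief in the correct high state) are our own simplifications for illustration.

```python
import math

def refined_flow(pre_true, post_true, eta):
    """Refined flow: eta * [-log2(1 + b_H(s)) + log2(1 + b'_H(s))], in bits."""
    return eta * (-math.log2(1 + pre_true) + math.log2(1 + post_true))

def postbelief_from_flow(pre_true, k, eta):
    """Invert the refined metric: b'_H(s) = 2^(k/eta) * b_H(s) + 2^(k/eta) - 1."""
    factor = 2 ** (k / eta)
    return factor * pre_true + factor - 1

eta = math.log2(3)            # size of the password in bits
pre_true, post_true = 0.01, 0.5

k = refined_flow(pre_true, post_true, eta)
lower = -eta * math.log2(1 + pre_true)       # dynamic lower bound
upper = eta * (1 - math.log2(1 + pre_true))  # dynamic upper bound

print(f"flow = {k:.4f} bits, range = [{lower:.4f}, {upper:.4f}]")
print(f"recovered postbelief = {postbelief_from_flow(pre_true, k, eta):.4f}")
print(f"likelihood ratio = {post_true / pre_true:.1f}, 2^k = {2 ** k:.4f}")
```

The flow stays within the size of the secret, the inverse relation recovers the postbelief, and the ratio of postbelief to prebelief exceeds 2 raised to the flow, in line with the theorem.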

E. Consistency of the Probability Distributions 

The bounds of Q", given in Theorem 1, ensure proper 
bounds of $b'_H$. This can be easily shown by assuming a flow 
of k bits and proceeding as follows: 

$$-\eta \cdot \log(1 + b_H(\sigma_H)) \le k \le \eta \cdot [1 - \log(1 + b_H(\sigma_H))]$$ 
$$2^{\log \frac{1}{1 + b_H(\sigma_H)}} \cdot (1 + b_H(\sigma_H)) - 1 \le b'_H(\sigma_H) \le 2^{\log \frac{2}{1 + b_H(\sigma_H)}} \cdot (1 + b_H(\sigma_H)) - 1$$ 
$$0 \le b'_H(\sigma_H) \le 1$$ 

However, this alone does not rule out an intermediate value of 
Q" leading to $b'_H$ falling outside the range [0, 1]. To rule this out, 
we need to show that Q" is a monotone function. This is done 
in Lemma 4. 

Lemma 4: Q" is a monotonically increasing function, that 
is: 

$$\forall b_1, b_2 : b_1 < b_2 \Rightarrow Q''(\mathcal{E}, b_1) < Q''(\mathcal{E}, b_2)$$ 

Thus, the probability distributions dealt with are invariably 
consistent. 

F. Meaningfulness of the Bounds 

We still have to accentuate the meaningfulness of the bounds 
of Q" in relation to the attacker's likelihood of a correct guess, 
or equivalently, to the attacker's certainty about the correct high 
state. This is done in Theorems 3 and 4. 

Theorem 3: An informing flow equal to the upper bound of 
Q" is sufficient to make a fully uncertain attacker fully certain 
about the correct high state. 

Corollary 2: Notice that, in the case of a fully uncertain 
attacker, we have: 

$$Q''_{min}(\mathcal{E}, b'_H) = -\eta \cdot \log(1 + b_H(\sigma_H)) = -\eta \cdot \log 1 = 0$$ 

This yields the absolute range $\varrho_{Q''} = [0, \eta]$ for Q", and 
reflects the rationale that a fully uncertain attacker can only 
be informed. 

Theorem 4: A misinforming flow equal to the lower bound 
of Q" is sufficient to make a fully certain attacker fully 
uncertain about the correct high state. 

A corollary similar to Corollary 2 can be stated to show that 
a fully certain attacker can only be misinformed. 

G. Other Refinements 

The discrimination construct, given in formula (9), which 
we used in our refinement is definitely not the only apt 
construct. Any construct that reduces the discrimination is a 
likely candidate for the replacement of the Kullback-Leibler 
construct (given in formula (8)). For instance, consider the 
following discrimination construct: 

$$\mathcal{I}''_{Dis}(\sigma) = \log \frac{1 + b'(\sigma)}{1 + b(\sigma)}$$ 

This construct clearly cuts down the discrimination. Moreover, 
it leads to the same refinement that the construct in (9) 
had led to. This shows that there is a large number of possible 
refinements of the Q metric. However, we favored the construct 
in (9) since the properties of Jensen-Shannon divergence are 
well-examined in the literature [11]. 

V. Exhaustive Search Effort 

Assume a program with a secret input of size $\eta$ bits, and an 
informing flow of k bits from the same program to an attacker. 
The dynamic upper bound of Q", given in Theorem 1, tells 
us that $k \le \eta$. Therefore, the space of the exhaustive search 
[12] that should be carried out in order to reveal the residual 
part of $\eta - k$ bits of the secret input is $2^{\eta - k}$. On the other hand, 
the dynamic upper bound of Q, given in Lemma 1, tells us 
that $k > \eta$ is a possible scenario. In such scenarios, the 
residual part of the secret input is impossible to determine, and 
consequently, the exhaustive search space cannot be established, 
even though the secret input might have been partially revealed 
to the attacker (refer to the example in Section II). 
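
The link between a size-consistent flow and the residual search space can be sketched as follows. The function name `residual_search_space` and the choice to raise an error when the flow is not size-consistent are illustrative assumptions, not part of the original analysis.

```python
import math

def residual_search_space(eta, k):
    """Size of the exhaustive search space left after an informing flow of k bits.

    eta is the size of the secret input in bits. The space 2^(eta - k) is only
    defined when the flow is size-consistent, i.e. 0 <= k <= eta.
    """
    if not 0 <= k <= eta:
        raise ValueError("flow is not consistent with the size of the secret input")
    return 2 ** (eta - k)

eta = math.log2(3)  # three candidate passwords

# Flow reported by the refined metric in the running example.
print(residual_search_space(eta, 0.9044))

# Flow reported by the original metric exceeds eta, so no space can be derived.
try:
    residual_search_space(eta, 5.6438)
except ValueError as err:
    print("undefined:", err)
```

With no flow the space equals the full password space, and with a flow equal to the size of the secret it collapses to a single candidate.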

VI. Remarks 

In addition to the divergence K, given in formula (10), 
Lin [11] identified two other divergence measures. The first 
divergence is denoted as J, and is given by the formula: 

$$J(b \to b') = \sum_{\sigma \in W_p} (b'(\sigma) - b(\sigma)) \cdot \log \frac{b'(\sigma)}{b(\sigma)}$$ 

This divergence is the symmetric form of Kullback-Leibler 
divergence, given in formula (3), and they both share the same 
problems; they are unbounded from above and undefined if 
$b(\sigma) = 0$ and $b'(\sigma) \ne 0$ for any $\sigma \in W_p$. It is therefore doubtful 
that the use of either of these two divergence measures would lead 
to size-consistent QIF quantifiers. The second divergence Lin 
identified is denoted as L, and is given by the formula: 

$$L(b \to b') = 2S\left(\frac{b + b'}{2}\right) - S(b) - S(b')$$ 

where S is Shannon uncertainty functional [5]. This divergence 
is the symmetric form of the divergence K we used in our 
refinement. It has an obvious information-theoretic interpreta- 
tion in terms of Shannon uncertainty functional which makes it 
suitable for use in accuracy-based information flow analysis 
when an attacker's belief about a program's secret input is 
modeled using advanced representations of uncertainty other 
than a simple probability distribution over high states. We leave 
the investigation of this use as future work. 
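
The statement that L is the symmetric form of K can be checked numerically: summing K in both directions reproduces the entropy-based expression for L. The sketch below uses our own function names and a dictionary encoding of beliefs; it is an illustration rather than a definitive implementation of Lin's measures.

```python
import math

def shannon(belief):
    """Shannon uncertainty (base-2 entropy) of a belief {state: probability}."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

def k_divergence(b, b_prime):
    """K-style divergence: sum of b'(s) * log2(b'(s) / ((b'(s) + b(s)) / 2))."""
    return sum(q * math.log2(q / ((q + b[s]) / 2))
               for s, q in b_prime.items() if q > 0)

def l_divergence(b, b_prime):
    """L divergence: 2 * S((b + b') / 2) - S(b) - S(b')."""
    mean = {s: (b[s] + b_prime[s]) / 2 for s in b}
    return 2 * shannon(mean) - shannon(b) - shannon(b_prime)

# Prebelief and postbelief from the password checker example.
pre = {"A": 0.98, "B": 0.01, "C": 0.01}
post = {"A": 0.0, "B": 0.5, "C": 0.5}

symmetric_k = k_divergence(pre, post) + k_divergence(post, pre)
print(l_divergence(pre, post), symmetric_k)
```

Both beliefs are assumed to be defined over the same set of states; the two printed values agree up to floating-point error.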

VII. Conclusions 

We presented a refinement of the QIF metric in [1], [2] 
that bounds its reported results by a plausible range. Both the 
original and the refined metric are justified quantifiers of the 
flow that occurred during a program's execution. However, they 
differ in their interpretation of one bit of flow. Contrary to the 
original metric, the results reported by the refined metric are 
easily associated with the exhaustive search effort needed to 
uncover a program's secret information (or the residual secret 
part of that information). We believe that the counter-intuitive 
flow quantities reported by some QIF quantifiers, that appear in 
the literature, are due to a flaw in the design of those quantifiers. 
We further believe that this can be avoided by introducing minor 
changes into the design of those quantifiers. 
Appendix I 
Proofs 



A. Proof of Lemma 1 

Kullback-Leibler divergence given in formula (3) has the range: 

$$0 \le D(b \to b') \le +\infty$$ 

which means that: 

$$-\infty \le D(b_H \to \dot{\sigma}_H) - D(b'_H \to \dot{\sigma}_H) \le +\infty$$ 
$$-\infty \le Q \le +\infty$$ 

It could be safer to bring the reader around by showing the 
extreme cases. The extreme case from above $Q = +\infty$ is reached 
when $b_H(\sigma_H) = 0$ and $b'_H(\sigma_H) = 1$, whereas the converse 
yields the extreme case from below $Q = -\infty$. An admissibility 
restriction is suggested in [1] on the attacker's prebelief. This 
restriction ensures that the prebelief never deviates by more 
than a positive factor from a uniform distribution, and is given 
by the formula: 

$$\min_{\sigma_H}(b_H(\sigma_H)) \ge \epsilon \cdot \frac{1}{|States|}; \quad \epsilon > 0$$ 

The restriction above more or less excludes the attacker's initial 
belief that certain states are impossible, or in other words, 
ascribing zero as a prebelief. However, it does not impose 
anything on the attacker's postbelief, which enables us to write: 

$$0 < b_H(\sigma_H) \le 1 \quad \text{and} \quad 0 \le b'_H(\sigma_H) \le 1$$ 

and consequently: 

$$-\infty \le D(b_H \to \dot{\sigma}_H) - D(b'_H \to \dot{\sigma}_H) < +\infty$$ 
$$-\infty \le Q < +\infty$$ 

Notice how the admissibility restriction is weak in that it 
averts reporting infinite informing flow from the metric Q, 
while leaving the rest of the counter-intuitive results unattended 
(perhaps this explains why the admissibility restriction is given 
in the original work [1], but not in the expanded one [2]). We 
have yet to arrive at the general range of Q. The last word on 
this matter relates to the fact that the attacker's postbelief about 
the correct high state can neither be better than full certainty nor 
worse than full uncertainty. The former of these two arguments 
yields the dynamic upper bound of Q which corresponds to the 
maximum informing flow: 

$$Q_{max}(\mathcal{E}, b'_H) = -\log b_H(\sigma_H) + \log 1 = -\log b_H(\sigma_H)$$ 

whereas the latter of the two arguments yields the absolute 
lower bound of Q which corresponds to the maximum 
misinforming flow: 

$$Q_{min}(\mathcal{E}, b'_H) = -\log b_H(\sigma_H) + \log 0 = -\infty$$ 

This gives us the general range of flow reported by Q: 

$$\varrho_Q = (-\infty, -\log b_H(\sigma_H)]$$ 

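B. Proof of Lemma 2 

The inequality of the arithmetic and geometric means gives us: 

$$\frac{b'(\sigma) + b(\sigma)}{2} \ge \sqrt{b'(\sigma) \cdot b(\sigma)}$$ 

Based on this, we can write: 

$$\mathcal{I}'_{Dis}(\sigma) = \log \frac{b'(\sigma)}{\frac{b'(\sigma) + b(\sigma)}{2}} \le \log \frac{b'(\sigma)}{\sqrt{b'(\sigma) \cdot b(\sigma)}} = \frac{1}{2} \mathcal{I}_{Dis}(\sigma)$$ 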

C. Proof of Lemma 3 

The divergence D' shown in formula (10) has the range [11]: 

$$0 \le D'(b \to b') \le 1$$ 

which means that: 

$$-1 \le D'(b_H \to \dot{\sigma}_H) - D'(b'_H \to \dot{\sigma}_H) \le 1$$ 
$$\varrho_{Q'} = [-1, 1]$$ 

D. Proof of Theorem 1 

Borrowing the same two arguments we used in the proof of 
Lemma 1, we obtain the dynamic upper bound of Q" which 
corresponds to the maximum informing flow: 

$$Q''_{max}(\mathcal{E}, b'_H) = \eta \cdot [1 - \log(1 + b_H(\sigma_H))]$$ 

and the dynamic lower bound of Q" which corresponds to the 
maximum misinforming flow: 

$$Q''_{min}(\mathcal{E}, b'_H) = -\eta \cdot \log(1 + b_H(\sigma_H))$$ 

This gives us the general range of flow reported by Q": 

$$\varrho_{Q''} = [-\eta \cdot \log(1 + b_H(\sigma_H)), \ \eta \cdot [1 - \log(1 + b_H(\sigma_H))]]$$ 

E. Proof of Theorem 2 

Assuming a flow of k bits gives us: 

$$Q''(\mathcal{E}, b'_H) = k$$ 
$$\eta \cdot [-\log(1 + b_H(\sigma_H)) + \log(1 + b'_H(\sigma_H))] = k$$ 
$$b'_H(\sigma_H) = 2^{k/\eta} \cdot b_H(\sigma_H) + 2^{k/\eta} - 1$$ 

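F. Proof of Lemma 4 

Let b denote the attacker's prebelief in the correct high state. 
Since the logarithm is increasing, we have: 

$$b_1 < b_2$$ 
$$-\log(1 + b) + \log(1 + b_1) < -\log(1 + b) + \log(1 + b_2)$$ 

Multiplying both sides by $\eta > 0$ gives: 

$$Q''(\mathcal{E}, b_1) < Q''(\mathcal{E}, b_2)$$ 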

G. Proof of Theorem 3 

A fully uncertain attacker about the correct high state has a 
zero prebelief. An informing flow equal to the upper bound of 
Q": 

$$Q''_{max}(\mathcal{E}, b'_H) = \eta \cdot [1 - \log(1 + b_H(\sigma_H))] = \eta \cdot [1 - \log 1] = \eta$$ 

advances the attacker's knowledge, and transforms her prebelief 
into the following postbelief: 

$$b'_H(\sigma_H) = 2^{k/\eta} \cdot b_H(\sigma_H) + 2^{k/\eta} - 1 = 2^{\eta/\eta} - 1 = 1$$ 

This postbelief captures the attacker's full certainty about the 
correct high state. 

H. Proof of Theorem [4] 

The proof is essentially the same as the proof of Theorem 3, 
although it starts with an attacker who is fully certain about the 
correct high state. 



